Schema Meta-Matching Extended
Abstract
Schema matching, the process of matching concepts that describe the meaning of data in heterogeneous, distributed data sources (e.g., database schemata, XML DTDs, HTML form tags), is one of the basic operations required by data integration. Recently, several algorithms for automatic schema matching have been proposed and evaluated in the database community. While in many domains these tools succeed in finding the right matching, empirical analysis shows that there is not (and probably never will be) any single algorithm that is guaranteed to succeed in all possible domains and applications. To overcome this problem, several tools are being developed that combine the principles by which different algorithms judge the similarity between concepts. In parallel, Anaby-Tavor et al. [1] take another approach, by which not one but K best-ranked mappings are generated and then examined iteratively until a good mapping is found. In this paper we introduce a novel framework for schema matching, which we call schema meta-matching. This approach extends the idea of working with top-K mappings, applying it to an arbitrary ensemble of schema matching algorithms. Informally, schema meta-matching is the problem of computing a “consensus” ranking of alternative mappings between two sets of concepts, given the “individual” graded rankings provided by several algorithms for schema matching. We begin with an overall look at schema matching, concluding with a discussion of how the field is likely to develop in the near future. We formalize the problem of schema meta-matching and introduce several algorithmic solutions for this problem, including one that adapts standard techniques for general quantitative rank aggregation, and others employing novel techniques specific to the problem of schema matching. We provide a formal analysis of the applicability and relative performance of each competing algorithm.
Finally, we show how combining these approaches results in the most successful matchings.
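To make the informal definition concrete, the following is a minimal brute-force sketch of a consensus ranking. It is not one of the paper's algorithms. It assumes that each matcher is given as a nested dictionary of pairwise similarities, that a mapping's grade under a matcher is the sum of its pairwise similarities, and that the consensus grade is the plain average over matchers. The names `score`, `consensus_top_k`, and the attribute names in the example are hypothetical.

```python
from itertools import permutations


def score(mapping, sim):
    """Grade of one mapping under one matcher (assumed: sum of pair similarities)."""
    return sum(sim[a][b] for a, b in mapping)


def consensus_top_k(source, target, matchers, k):
    """Rank every 1:1 mapping by its average grade across all matchers.

    Brute force over all injective assignments of source to target concepts,
    so this is only practical for very small schemas.
    Requires len(target) >= len(source).
    """
    ranked = []
    for perm in permutations(target, len(source)):
        mapping = tuple(zip(source, perm))
        grades = [score(mapping, sim) for sim in matchers]
        ranked.append((sum(grades) / len(grades), mapping))
    # Highest consensus grade first; keep the K best mappings.
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked[:k]


# Hypothetical example: two concepts per schema, two matchers.
source = ["name", "phone"]
target = ["fullName", "tel"]
matcher_a = {
    "name": {"fullName": 0.9, "tel": 0.1},
    "phone": {"fullName": 0.2, "tel": 0.6},
}
matcher_b = {
    "name": {"fullName": 0.7, "tel": 0.3},
    "phone": {"fullName": 0.1, "tel": 0.8},
}

top = consensus_top_k(source, target, [matcher_a, matcher_b], k=2)
for grade, mapping in top:
    print(round(grade, 2), mapping)
```

Enumerating every mapping grows factorially with schema size, which is one reason a practical solution would lean on rank aggregation techniques rather than exhaustive search.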
Similar resources
An Improved Semantic Schema Matching Approach
Schema matching is a critical step in many applications, such as data warehouse loading, Online Analytical Processing (OLAP), data mining, the semantic web [2] and schema integration. The task is to find the semantic correspondences between elements of two schemas. Recently, schema matching has attracted considerable interest in both research and practice. In this paper, we present a new impr...
An Algebraic Framework for Schema Matching
It is well known that a formal framework for the schema matching problem (SMP) is important because it facilitates the building of algorithm models and the evaluation of algorithms. An algebraic framework for schema matching is developed in this paper. First, based on universal algebra, we propose a meta-meta structure for schema, which is named multi-labeled schema. This definition has a distin...
Designing a Benchmark for the Assessment of XML Schema Matching Tools
Over the years, many XML schema matching systems have been developed. A benchmark that assesses the capabilities of schema matching systems and provides uniform conditions and a common testbed for all schema matching prototypes has become indispensable as the matching systems grow in complexity. However, developing a benchmark for the schema matching problem is very challenging, given the wid...
Instance-based Schema Matching for Web Databases by Domain-specific Query Probing
In a Web database that dynamically provides information in response to user queries, two distinct schemas, interface schema (the schema users can query) and result schema (the schema users can browse), are presented to users. Each partially reflects the actual schema of the Web database. Most previous work only studied the problem of schema matching across query interfaces of Web databases. In ...
A Hybrid Approach to Schema and Data Integration for Meta-search Engines
In this paper, we describe an approach to schema and data integration for meta-search engines. The integration of heterogeneous, distributed information from the Web is a complicated task, especially the task of schema/data matching and integration. During the matching and integration process, we need to handle syntactic, semantic and structural heterogeneity between multiple information source...
Publication date: 2004